Three different types of input files are used. They are - MAF file- which is gz compressed - Clinical data associated with each sample / Tumor_sample_barcode in MAF (optional) - Copy number data if its available
The read.maf function is utilized for reading and processing MAF files, subsequently storing the information as a MAF object. Additionally, it can accept input for the tumor sample barcode and copy number data. By default, variants with high or moderate consequences are regarded as non-synonymous, but this can be adjusted as required using the vc_nonSyn parameter in read.maf.
A MAF object comprises the MAF file itself, along with summarized information and related sample annotations. To access valuable components within MAF objects, one can employ the accessor methods.
laml
## An object of class MAF
## ID summary Mean Median
## 1: NCBI_Build 37 NA NA
## 2: Center genome.wustl.edu NA NA
## 3: Samples 193 NA NA
## 4: nGenes 1241 NA NA
## 5: Frame_Shift_Del 52 0.269 0
## 6: Frame_Shift_Ins 91 0.472 0
## 7: In_Frame_Del 10 0.052 0
## 8: In_Frame_Ins 42 0.218 0
## 9: Missense_Mutation 1342 6.953 7
## 10: Nonsense_Mutation 103 0.534 0
## 11: Splice_Site 92 0.477 0
## 12: total 1732 8.974 9
#Shows sample summary.
getSampleSummary(laml)
## Tumor_Sample_Barcode Frame_Shift_Del Frame_Shift_Ins In_Frame_Del
## 1: TCGA-AB-3009 0 5 0
## 2: TCGA-AB-2807 1 0 1
## 3: TCGA-AB-2959 0 0 0
## 4: TCGA-AB-3002 0 0 0
## 5: TCGA-AB-2849 0 1 0
## ---
## 189: TCGA-AB-2942 0 0 0
## 190: TCGA-AB-2946 0 0 0
## 191: TCGA-AB-2954 0 0 0
## 192: TCGA-AB-2982 0 0 0
## 193: TCGA-AB-2903 0 0 0
## In_Frame_Ins Missense_Mutation Nonsense_Mutation Splice_Site total
## 1: 1 25 2 1 34
## 2: 0 16 3 4 25
## 3: 0 22 0 1 23
## 4: 0 15 1 5 21
## 5: 0 16 1 2 20
## ---
## 189: 1 0 0 0 1
## 190: 0 1 0 0 1
## 191: 0 1 0 0 1
## 192: 0 1 0 0 1
## 193: 0 0 0 0 0
#Shows gene summary.
getGeneSummary(laml)
## Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins
## 1: FLT3 0 0 1 33
## 2: DNMT3A 4 0 0 0
## 3: NPM1 0 33 0 0
## 4: IDH2 0 0 0 0
## 5: IDH1 0 0 0 0
## ---
## 1237: ZNF689 0 0 0 0
## 1238: ZNF75D 0 0 0 0
## 1239: ZNF827 1 0 0 0
## 1240: ZNF99 0 0 0 0
## 1241: ZPBP 0 0 0 0
## Missense_Mutation Nonsense_Mutation Splice_Site total MutatedSamples
## 1: 15 0 3 52 52
## 2: 39 5 6 54 48
## 3: 1 0 0 34 33
## 4: 20 0 0 20 20
## 5: 18 0 0 18 18
## ---
## 1237: 1 0 0 1 1
## 1238: 1 0 0 1 1
## 1239: 0 0 0 1 1
## 1240: 1 0 0 1 1
## 1241: 1 0 0 1 1
## AlteredSamples
## 1: 52
## 2: 48
## 3: 33
## 4: 20
## 5: 18
## ---
## 1237: 1
## 1238: 1
## 1239: 1
## 1240: 1
## 1241: 1
#shows clinical data associated with samples
getClinicalData(laml)
## Tumor_Sample_Barcode FAB_classification days_to_last_followup
## 1: TCGA-AB-2802 M4 365
## 2: TCGA-AB-2803 M3 792
## 3: TCGA-AB-2804 M3 2557
## 4: TCGA-AB-2805 M0 577
## 5: TCGA-AB-2806 M1 945
## ---
## 189: TCGA-AB-3007 M3 1581
## 190: TCGA-AB-3008 M1 822
## 191: TCGA-AB-3009 M4 577
## 192: TCGA-AB-3011 M1 1885
## 193: TCGA-AB-3012 M3 1887
## Overall_Survival_Status
## 1: 1
## 2: 1
## 3: 0
## 4: 1
## 5: 1
## ---
## 189: 0
## 190: 1
## 191: 1
## 192: 0
## 193: 0
#Shows all fields in MAF
getFields(laml)
## [1] "Hugo_Symbol" "Entrez_Gene_Id" "Center"
## [4] "NCBI_Build" "Chromosome" "Start_Position"
## [7] "End_Position" "Strand" "Variant_Classification"
## [10] "Variant_Type" "Reference_Allele" "Tumor_Seq_Allele1"
## [13] "Tumor_Seq_Allele2" "Tumor_Sample_Barcode" "Protein_Change"
## [16] "i_TumorVAF_WU" "i_transcript_name"
#Writes maf summary to an output file with basename laml.
write.mafSummary(maf = laml, basename = 'laml')
The plotmafSummary function enables summarization of the MAF file, presenting the variant count per sample through a stacked barplot, and exhibiting variant types via box plots.
For a simplified barplot of mutated genes, we use the mafbarplot function.
plotmafSummary(maf = laml, rmOutlier = TRUE, addStat = 'median', dashboard = TRUE, titvRaw = FALSE)
Oncoplots and waterfall plots are better options for a better visualization.
NOTE: Variants annotated as Multi_Hit are those genes which are mutated more than once in the same sample.
#oncoplot for top ten mutated genes.
oncoplot(maf = laml, top = 10)
The titv function categorizes Transitions and Transversions in SNPs, providing a summary table in diverse formats. This data can be depicted as a layered barplot that shows the conversion percentages for each sample, as well as a boxplot that illustrates the distribution of six unique conversion types.
laml.titv = titv(maf = laml, plot = FALSE, useSyn = TRUE)
#plot titv summary
plotTiTv(res = laml.titv)
To employ the lollipopPlot function, information about amino acid alterations in the MAF file is necessary. However, the field (or column) names representing these changes in MAF files differ across studies due to the absence of standardized guidelines. By default, lollipopPlot looks for the AAChange column in the MAF file and, if unavailable, issues a warning while displaying all accessible fields. In the example provided, the MAF file records amino acid modifications in a field or column titled “Protein_Change.” To specify this explicitly, we will use the AACol parameter.
#lollipop plot for DNMT3A, which is one of the most frequent mutated gene in Leukemia.
lollipopPlot(
maf = laml,
gene = 'DNMT3A',
AACol = 'Protein_Change',
showMutationRate = TRUE,
labelPos = 882
)
## 3 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## 1: DNMT3A NM_022552 NP_072046 912
## 2: DNMT3A NM_153759 NP_715640 723
## 3: DNMT3A NM_175629 NP_783328 912
## Using longer transcript NM_022552 for now.
## Removed 3 mutations for which AA position was not available
plotProtein(gene = "TP53", refSeqID = "NM_000546")
Cancer genomes, especially solid tumors, frequently exhibit genomic regions with concentrated hyper-mutations5. Visualizing the distance between variants on a linear genomic scale can help identify these hyper-mutated areas. These visualizations are commonly called rainfall plots, which can be generated using the rainfallPlot function. By setting the detectChangePoints parameter to TRUE, the rainfall plot can also highlight potential regions of change in the distances between events
brca <- system.file("extdata", "brca.maf.gz", package = "maftools")
brca = read.maf(maf = brca, verbose = FALSE)
rainfallPlot(maf = brca, detectChangePoints = TRUE, pointSize = 0.4)
## Processing TCGA-A8-A08B..
## Kataegis detected at:
## Chromosome Start_Position End_Position nMuts Avg_intermutation_dist Size
## 1: 8 98129348 98133560 7 702.0000 4212
## 2: 8 98398549 98403536 9 623.3750 4987
## 3: 8 98453076 98456466 9 423.7500 3390
## 4: 8 124090377 124096810 22 306.3333 6433
## 5: 12 97436055 97439705 7 608.3333 3650
## 6: 17 29332072 29336153 8 583.0000 4081
## Tumor_Sample_Barcode C>G C>T
## 1: TCGA-A8-A08B 4 3
## 2: TCGA-A8-A08B 1 8
## 3: TCGA-A8-A08B 1 8
## 4: TCGA-A8-A08B 1 21
## 5: TCGA-A8-A08B 4 3
## 6: TCGA-A8-A08B 4 4
To contrast mutation burden with 33 TCGA cohorts, the tcgaCompare function utilizes mutation data from TCGA MC3. The resulting plot bears a resemblance to the one presented by Alexandrov et al.5.
laml.mutload = tcgaCompare(maf = laml, cohortName = 'Example-LAML', logscale = TRUE, capture_size = 50)
## Warning in FUN(X[[i]], ...): Removed 1 samples with zero mutations.
## Capture size [TCGA]: 35.8
## Capture size [Input]: 50
## Performing pairwise t-test for differences in mutation burden (per MB)..
For a rapid assessment of the clonal status of commonly mutated genes, this utility presents variant allele frequencies in the form of a boxplot. Clonal genes generally exhibit mean allele frequencies near 50%, assuming a pure sample.
plotVaf(maf = laml, vafCol = 'i_TumorVAF_WU')
The somaticInteractions function can be employed to detect sets of genes that are either mutually exclusive or co-occurring. This is achieved through a pair-wise Fisher’s exact test, which identifies significant relationships between these genes.
#exclusive/co-occurance event analysis on top 10 mutated genes.
somaticInteractions(maf = laml, top = 25, pvalue = c(0.05, 0.1))
## gene1 gene2 pValue oddsRatio 00 11 01 10 pAdj
## 1: ASXL1 RUNX1 0.0001541586 55.215541 176 4 12 1 0.003568486
## 2: IDH2 RUNX1 0.0002809928 9.590877 164 7 9 13 0.006055880
## 3: IDH2 ASXL1 0.0004030636 41.077327 172 4 1 16 0.008126283
## 4: FLT3 NPM1 0.0009929836 3.763161 125 17 16 35 0.018664260
## 5: SMC3 DNMT3A 0.0010451985 20.177713 144 6 42 1 0.018664260
## ---
## 296: PLCE1 ASXL1 1.0000000000 0.000000 184 0 5 4 1.000000000
## 297: RAD21 FAM5C 1.0000000000 0.000000 183 0 5 5 1.000000000
## 298: PLCE1 FAM5C 1.0000000000 0.000000 184 0 5 4 1.000000000
## 299: PLCE1 RAD21 1.0000000000 0.000000 184 0 5 4 1.000000000
## 300: EZH2 PLCE1 1.0000000000 0.000000 186 0 4 3 1.000000000
## Event pair event_ratio
## 1: Co_Occurence ASXL1, RUNX1 4/13
## 2: Co_Occurence IDH2, RUNX1 7/22
## 3: Co_Occurence ASXL1, IDH2 4/17
## 4: Co_Occurence FLT3, NPM1 17/51
## 5: Co_Occurence DNMT3A, SMC3 6/43
## ---
## 296: Mutually_Exclusive ASXL1, PLCE1 0/9
## 297: Mutually_Exclusive FAM5C, RAD21 0/10
## 298: Mutually_Exclusive FAM5C, PLCE1 0/9
## 299: Mutually_Exclusive PLCE1, RAD21 0/9
## 300: Mutually_Exclusive EZH2, PLCE1 0/7
Maftools includes a feature called Oncodrive, which identifies cancer-causing genes (drivers) within a given MAF file. This feature is grounded in the oncodriveCLUST algorithm’s Python implementation. The underlying concept stems from the observation that most gene mutations related to cancer tend to concentrate at a limited number of specific sites, also known as hotspots. Oncodrive leverages these hotspots to detect cancer-related genes.
laml.sig = oncodrive(maf = laml, AACol = 'Protein_Change', minMut = 5, pvalMethod = 'zscore')
## Warning in oncodrive(maf = laml, AACol = "Protein_Change", minMut = 5,
## pvalMethod = "zscore"): Oncodrive has been superseeded by OncodriveCLUSTL. See
## http://bg.upf.edu/group/projects/oncodrive-clust.php
## Estimating background scores from synonymous variants..
## Not enough genes to build background. Using predefined values. (Mean = 0.279; SD = 0.13)
## Estimating cluster scores from non-syn variants..
##
|
| | 0%
|
|=== | 4%
|
|====== | 9%
|
|========= | 13%
|
|============ | 17%
|
|=============== | 22%
|
|================== | 26%
|
|===================== | 30%
|
|======================== | 35%
|
|=========================== | 39%
|
|============================== | 43%
|
|================================= | 48%
|
|===================================== | 52%
|
|======================================== | 57%
|
|=========================================== | 61%
|
|============================================== | 65%
|
|================================================= | 70%
|
|==================================================== | 74%
|
|======================================================= | 78%
|
|========================================================== | 83%
|
|============================================================= | 87%
|
|================================================================ | 91%
|
|=================================================================== | 96%
|
|======================================================================| 100%
## Comapring with background model and estimating p-values..
## Done !
head(laml.sig)
## Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins
## 1: IDH1 0 0 0 0
## 2: IDH2 0 0 0 0
## 3: NPM1 0 33 0 0
## 4: NRAS 0 0 0 0
## 5: U2AF1 0 0 0 0
## 6: KIT 1 1 0 1
## Missense_Mutation Nonsense_Mutation Splice_Site total MutatedSamples
## 1: 18 0 0 18 18
## 2: 20 0 0 20 20
## 3: 1 0 0 34 33
## 4: 15 0 0 15 15
## 5: 8 0 0 8 8
## 6: 7 0 0 10 8
## AlteredSamples clusters muts_in_clusters clusterScores protLen zscore
## 1: 18 1 18 1.0000000 414 5.546154
## 2: 20 2 20 1.0000000 452 5.546154
## 3: 33 2 32 0.9411765 294 5.093665
## 4: 15 2 15 0.9218951 189 4.945347
## 5: 8 1 7 0.8750000 240 4.584615
## 6: 8 2 9 0.8500000 976 4.392308
## pval fdr fract_muts_in_clusters
## 1: 1.460110e-08 1.022077e-07 1.0000000
## 2: 1.460110e-08 1.022077e-07 1.0000000
## 3: 1.756034e-07 8.194826e-07 0.9411765
## 4: 3.800413e-07 1.330144e-06 1.0000000
## 5: 2.274114e-06 6.367520e-06 0.8750000
## 6: 5.607691e-06 1.308461e-05 0.9000000
plotOncodrive(res = laml.sig, fdrCutOff = 0.1, useFraction = TRUE, labelSize = 0.5)
Maftools incorporates the pfamDomains function, which appends pfam domain data to amino acid alterations. Additionally, pfamDomain provides a summary of amino acid changes based on the impacted domains. This insight assists in determining which domain is predominantly affected in a cancer cohort.
laml.pfam = pfamDomains(maf = laml, AACol = 'Protein_Change', top = 10)
## Warning in pfamDomains(maf = laml, AACol = "Protein_Change", top = 10): Removed
## 50 mutations for which AA position was not available
#Protein summary (Printing first 7 columns for display convenience)
laml.pfam$proteinSummary[,1:7, with = FALSE]
## HGNC AAPos Variant_Classification N total fraction DomainLabel
## 1: DNMT3A 882 Missense_Mutation 27 54 0.5000000 AdoMet_MTases
## 2: IDH1 132 Missense_Mutation 18 18 1.0000000 PTZ00435
## 3: IDH2 140 Missense_Mutation 17 20 0.8500000 PTZ00435
## 4: FLT3 835 Missense_Mutation 14 52 0.2692308 PKc_like
## 5: FLT3 599 In_Frame_Ins 10 52 0.1923077 PKc_like
## ---
## 1512: ZNF646 875 Missense_Mutation 1 1 1.0000000 <NA>
## 1513: ZNF687 554 Missense_Mutation 1 2 0.5000000 <NA>
## 1514: ZNF687 363 Missense_Mutation 1 2 0.5000000 <NA>
## 1515: ZNF75D 5 Missense_Mutation 1 1 1.0000000 <NA>
## 1516: ZNF827 427 Frame_Shift_Del 1 1 1.0000000 <NA>
#Domain summary (Printing first 3 columns for display convenience)
laml.pfam$domainSummary[,1:3, with = FALSE]
## DomainLabel nMuts nGenes
## 1: PKc_like 55 5
## 2: PTZ00435 38 2
## 3: AdoMet_MTases 33 1
## 4: 7tm_1 24 24
## 5: COG5048 17 17
## ---
## 499: ribokinase 1 1
## 500: rim_protein 1 1
## 501: sigpep_I_bact 1 1
## 502: trp 1 1
## 503: zf-BED 1 1
The mafSurvival function carries out survival analysis and generates Kaplan-Meier curves by grouping samples according to the mutation status of user-specified genes or manually provided samples. This function necessitates input data containing a tumor sample barcode, a binary event, and time to the event.
#Survival analysis based on grouping of DNMT3A mutation status
mafSurvival(maf = laml, genes = 'DNMT3A', time = 'days_to_last_followup', Status = 'Overall_Survival_Status', isTCGA = TRUE)
## Looking for clinical data in annoatation slot of MAF..
## Number of mutated samples for given genes:
## DNMT3A
## 48
## Removed 11 samples with NA's
## Median survival..
## Group medianTime N
## 1: Mutant 245 45
## 2: WT 396 137
identifies genes with poor survival rate
#Using top 20 mutated genes to identify a set of genes (of size 2) to predict poor prognostic groups
prog_geneset = survGroup(maf = laml, top = 20, geneSetSize = 2, time = "days_to_last_followup", Status = "Overall_Survival_Status", verbose = FALSE)
## Removed 11 samples with NA's
print(prog_geneset)
## Gene_combination P_value hr WT Mutant
## 1: FLT3_DNMT3A 0.00104 2.510 164 18
## 2: DNMT3A_SMC3 0.04880 2.220 176 6
## 3: DNMT3A_NPM1 0.07190 1.720 166 16
## 4: DNMT3A_TET2 0.19600 1.780 176 6
## 5: FLT3_TET2 0.20700 1.860 177 5
## 6: NPM1_IDH1 0.21900 0.495 176 6
## 7: DNMT3A_IDH1 0.29300 1.500 173 9
## 8: IDH2_RUNX1 0.31800 1.580 176 6
## 9: FLT3_NPM1 0.53600 1.210 165 17
## 10: DNMT3A_IDH2 0.68000 0.747 178 4
## 11: DNMT3A_NRAS 0.99200 0.986 178 4
mafSurvGroup(maf = laml, geneSet = c("DNMT3A", "FLT3"), time = "days_to_last_followup", Status = "Overall_Survival_Status")
## Looking for clinical data in annoatation slot of MAF..
## Removed 11 samples with NA's
## Median survival..
## Group medianTime N
## 1: Mutant 242.5 18
## 2: WT 379.5 164
A recent study by Madan et al (9) revealed that PML and RARA gene mutations occurred in patients experiencing a relapse of Acute Promyelocytic Leukemia (APL) but were not detected during the initial stage of the illness. This finding implies that mutation patterns may differ among various cancer cohorts. To pinpoint differentially mutated genes between two cohorts, the mafCompare function performs a Fisher test on all genes.
#Primary APL MAF
primary.apl = system.file("extdata", "APL_primary.maf.gz", package = "maftools")
primary.apl = read.maf(maf = primary.apl)
## -Reading
## -Validating
## --Non MAF specific values in Variant_Classification column:
## ITD
## -Silent variants: 45
## -Summarizing
## -Processing clinical data
## --Missing clinical data
## -Finished in 0.106s elapsed (0.092s cpu)
#Relapse APL MAF
relapse.apl = system.file("extdata", "APL_relapse.maf.gz", package = "maftools")
relapse.apl = read.maf(maf = relapse.apl)
## -Reading
## -Validating
## --Non MAF specific values in Variant_Classification column:
## ITD
## -Silent variants: 19
## -Summarizing
## -Processing clinical data
## --Missing clinical data
## -Finished in 0.082s elapsed (0.077s cpu)
#Considering only genes which are mutated in at-least in 5 samples in one of the cohort to avoid bias due to genes mutated in single sample.
pt.vs.rt <- mafCompare(m1 = primary.apl, m2 = relapse.apl, m1Name = 'Primary', m2Name = 'Relapse', minMut = 5)
print(pt.vs.rt)
## $results
## Hugo_Symbol Primary Relapse pval or ci.up ci.low
## 1: PML 1 11 1.529935e-05 0.03537381 0.2552937 0.000806034
## 2: RARA 0 7 2.574810e-04 0.00000000 0.3006159 0.000000000
## 3: RUNX1 1 5 1.310500e-02 0.08740567 0.8076265 0.001813280
## 4: FLT3 26 4 1.812779e-02 3.56086275 14.7701728 1.149009169
## 5: ARID1B 5 8 2.758396e-02 0.26480490 0.9698686 0.064804160
## 6: WT1 20 14 2.229087e-01 0.60619329 1.4223101 0.263440988
## 7: KRAS 6 1 4.334067e-01 2.88486293 135.5393108 0.337679367
## 8: NRAS 15 4 4.353567e-01 1.85209500 8.0373994 0.553883512
## 9: ARID1A 7 4 7.457274e-01 0.80869223 3.9297309 0.195710173
## adjPval
## 1: 0.0001376942
## 2: 0.0011586643
## 3: 0.0393149868
## 4: 0.0407875250
## 5: 0.0496511201
## 6: 0.3343630535
## 7: 0.4897762916
## 8: 0.4897762916
## 9: 0.7457273717
##
## $SampleSummary
## Cohort SampleSize
## 1: Primary 124
## 2: Relapse 58
forestPlot(mafCompareRes = pt.vs.rt, pVal = 0.1)
genes = c("PML", "RARA", "RUNX1", "ARID1B", "FLT3")
coOncoplot(m1 = primary.apl, m2 = relapse.apl, m1Name = 'PrimaryAPL', m2Name = 'RelapseAPL', genes = genes, removeNonMutated = TRUE)
coBarplot(m1 = primary.apl, m2 = relapse.apl, m1Name = "Primary", m2Name = "Relapse")
lollipopPlot2(m1 = primary.apl, m2 = relapse.apl, gene = "PML", AACol1 = "amino_acid_change", AACol2 = "amino_acid_change", m1_name = "Primary", m2_name = "Relapse")
## Gene: PML
## 9 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## 1: PML NM_002675 NP_002666 633
## 2: PML NM_033238 NP_150241 882
## 3: PML NM_033239 NP_150242 829
## 4: PML NM_033240 NP_150243 611
## 5: PML NM_033244 NP_150247 560
## 6: PML NM_033246 NP_150249 423
## 7: PML NM_033247 NP_150250 435
## 8: PML NM_033249 NP_150252 585
## 9: PML NM_033250 NP_150253 781
## Using longer transcript NM_033238 for now.
## 9 transcripts available. Use arguments refSeqID or proteinID to manually specify tx name.
## HGNC refseq.ID protein.ID aa.length
## 1: PML NM_002675 NP_002666 633
## 2: PML NM_033238 NP_150241 882
## 3: PML NM_033239 NP_150242 829
## 4: PML NM_033240 NP_150243 611
## 5: PML NM_033244 NP_150247 560
## 6: PML NM_033246 NP_150249 423
## 7: PML NM_033247 NP_150250 435
## 8: PML NM_033249 NP_150252 585
## 9: PML NM_033250 NP_150253 781
## Using longer transcript NM_033238 for now.
The “clinicalEnrichment” function aims to perform enrichment analysis on clinical attributes associated with samples. By comparing mutations across various categories within a clinical characteristic, it identifies potential enrichments through both groupwise and pairwise comparisons. For example, the function can be utilized to discover mutations connected to FAB_classification.
fab.ce = clinicalEnrichment(maf = laml, clinicalFeature = 'FAB_classification')
## Sample size per factor in FAB_classification:
##
## M0 M1 M2 M3 M4 M5 M6 M7
## 19 44 44 21 39 19 3 3
#Results are returned as a list. Significant associations p-value < 0.05
fab.ce$groupwise_comparision[p_value < 0.05]
## Hugo_Symbol Group1 Group2 n_mutated_group1 n_mutated_group2 p_value
## 1: IDH1 M1 Rest 11 of 44 7 of 149 0.0002597371
## 2: TP53 M7 Rest 3 of 3 12 of 190 0.0003857187
## 3: DNMT3A M5 Rest 10 of 19 38 of 174 0.0089427384
## 4: CEBPA M2 Rest 7 of 44 6 of 149 0.0117352110
## 5: RUNX1 M0 Rest 5 of 19 11 of 174 0.0117436825
## 6: NPM1 M5 Rest 7 of 19 26 of 174 0.0248582372
## 7: NPM1 M3 Rest 0 of 21 33 of 172 0.0278630823
## 8: DNMT3A M3 Rest 1 of 21 47 of 172 0.0294005111
## OR OR_low OR_high fdr
## 1: 6.670592 2.173829026 21.9607250 0.0308575
## 2: Inf 5.355415451 Inf 0.0308575
## 3: 3.941207 1.333635173 11.8455979 0.3757978
## 4: 4.463237 1.204699322 17.1341278 0.3757978
## 5: 5.216902 1.243812880 19.4051505 0.3757978
## 6: 3.293201 1.001404899 10.1210509 0.5880102
## 7: 0.000000 0.000000000 0.8651972 0.5880102
## 8: 0.133827 0.003146708 0.8848897 0.5880102
plotEnrichmentResults(enrich_res = fab.ce, pVal = 0.05, geneFontSize = 0.5, annoFontSize = 0.6)
The “drugInteractions” function analyzes the interplay between drugs and genes, in addition to exploring the potential for genes to be targeted by drugs, utilizing data sourced from the Drug Gene Interaction database.
dgi = drugInteractions(maf = laml, fontSize = 0.75)
The illustration presents diverse groups of genes that may be targeted by drugs, as well as the five most prominent genes linked to each group. Furthermore, it allows for the retrieval of drug-gene interaction specifics, like the collection of recognized drugs that engage with DNMT3A.
dnmt3a.dgi = drugInteractions(genes = "DNMT3A", drugs = TRUE)
## Number of claimed drugs for given genes:
## Gene N
## 1: DNMT3A 7
#Printing selected columns.
dnmt3a.dgi[,.(Gene, interaction_types, drug_name, drug_claim_name)]
## Gene interaction_types drug_name drug_claim_name
## 1: DNMT3A N/A
## 2: DNMT3A DAUNORUBICIN Daunorubicin
## 3: DNMT3A DECITABINE Decitabine
## 4: DNMT3A IDARUBICIN IDARUBICIN
## 5: DNMT3A DECITABINE DECITABINE
## 6: DNMT3A inhibitor DECITABINE CHEMBL1201129
## 7: DNMT3A inhibitor AZACITIDINE CHEMBL1489
The OncogenicPathways function evaluates whether TCGA cohorts11 exhibit a significant rise in the prevalence of well-known Oncogenic Signaling Pathways.
OncogenicPathways(maf = laml)
## Pathway N n_affected_genes fraction_affected Mutated_samples
## 1: PI3K 29 1 0.03448276 1
## 2: NRF2 3 1 0.33333333 1
## 3: TP53 6 2 0.33333333 15
## 4: WNT 68 3 0.04411765 4
## 5: MYC 13 3 0.23076923 3
## 6: NOTCH 71 6 0.08450704 8
## 7: Hippo 38 7 0.18421053 7
## 8: RTK-RAS 85 18 0.21176471 97
## Fraction_mutated_samples
## 1: 0.005181347
## 2: 0.005181347
## 3: 0.077720207
## 4: 0.020725389
## 5: 0.015544041
## 6: 0.041450777
## 7: 0.036269430
## 8: 0.502590674
PlotOncogenicPathways(maf = laml, pathways = "RTK-RAS")
##bibliography
Mayakonda, Anand, De-Chen Lin, Yassen Assenov, Christoph Plass, and H. Phillip Koeffler. 2018. “Maftools: Efficient and Comprehensive Analysis of Somatic Variants in Cancer.” Genome Research 28 (11): 1747–56..
Network, Cancer Genome Atlas Research, Timothy J. Ley, Christopher Miller, Li Ding, Benjamin J. Raphael, Andrew J. Mungall, A. Gordon Robertson, et al. 2013. “Genomic and Epigenomic Landscapes of Adult de Novo Acute Myeloid Leukemia.” The New England Journal of Medicine 368 (22): 2059–74..